Task-Similarity-Based VNF Aggregation for Air–Ground Integrated Networks

In a harsh environment, function aggregation of air–ground integrated network service function chaining (SFC) deployment can easily cause network load imbalance, which affects the network security and reliability. In this study, a task-similarity-based virtual network function (VNF) aggregation scheme was proposed. It considered air–ground network resource consumption and load balance before SFC mapping. A model for selecting VNFs to be aggregated based on task similarity was built. The tasks were classified based on their similarity. Furthermore, the VNFs to be aggregated were selected within the class under the constraints of the underlying physical resources. Load balancing was achieved by adjusting the similarity threshold. Moreover, an SFC mapping selection scheme based on network resource awareness was used to obtain the most suitable physical nodes for single-chain and multi-chain mapping according to various attributes of physical network nodes. The simulation results indicated that the proposed scheme with a better load balance design outperformed existing works on VNF aggregation. We also demonstrated that the task-similarity-based scheme was resource-consumption efficient and effective.


Introduction
The air-ground integrated network makes up for the shortcomings of the independent ground network and the air network composed of drones. Network integration provides seamless global connections and high-speed computing capabilities [1] to satisfy the service quality requirements of harsh environments, such as forest search, reconnaissance, jamming and trench surveying. However, the rapid deployment of new tasks has led to a rapid increase in the amount of hardware and caused further problems such as high operating costs, insufficient resource utilization and inflexible management [2]. Network function virtualization (NFV) reduces the operating cost and improves the reliability of network deployment [3] by decoupling the network functions from the hardware devices they run on [4].
In the NFV network, the required functions are fulfilled by instantiating virtual network functions (VNFs). To improve resource utilization, different studies adopted different function aggregation methods. For example, in [5], for a terrestrial-satellite hybrid cloud network, the optimal deployment position of a VNF that appeared in an SFC for the first time was obtained using linear programming. When the same VNF appeared in a subsequent service, the requested SFC was merged with the SFC that already ran the same VNF in the cloud, and the merged chain was processed as new resource consumption. If the mapping was successful, the merged chain was deployed to the new cloud for processing; otherwise, the optimal deployment plan obtained by linear programming was deployed directly in the cloud. In [6], the concept of an aggregation rate was proposed in the air-space-ground integrated network for service reconfiguration based on the service function chain.
A service request is denoted by $R = \langle F, E \rangle$, where $R = \{r \mid r = 1, 2, \ldots, |R|\}$ represents the service requests that arrive within time $T$. The source and destination nodes of each service request are represented by $\{s_r \mid r \in R\}$ and $\{d_r \mid r \in R\}$, respectively. The nodes of a service request are its VNFs, and the links are the dependencies among the VNFs. Each service requires $k$ types of VNFs and specifies bandwidth requirements $\{B_r \mid r \in R\}$, computing power requirements $\{C_r \mid C_r > 0, r \in R\}$ and delay deadlines $\{D_r \mid r \in R\}$. Each service request $r$ consists of a set of VNFs $\{f_1, f_2, \ldots, f_i \mid i \in F\}$, and the links $E_r = \{(p, q) \mid p, q \in F, r \in R\}$ between the VNFs need to follow the dependencies between the virtual network functions.
Consumption model: To visually express the SFC mapping and to make calculations more convenient, the model variables were defined in matrix form. The unknown vertex mapping $X_\psi$ and link mapping $Y_\psi$ were each represented by a binary matrix. Define $x^r_{f,a}$ as the virtual network function deployment variable for task $r$; when its value is 1, the virtual network function $f$ is deployed on the physical node $a$. $\Pi = \{\Pi_{ab}\}, \forall a, b \in N$ represents the collection of all paths between the network nodes. Define the virtual link mapping variable $y^r_{pq,ab}$, which has a value of 1 when the link between two adjacent virtual network functions $p$ and $q$ in task $r$ is mapped to the underlying physical path $\Pi_{ab}$; otherwise, $y^r_{pq,ab}$ has the value 0. We defined a virtual network function mapping binary matrix $X$ for all possible vertex mappings.
Similarly, we defined a binary matrix $Y$ of link mappings that represents all possible link mappings:
$$|E| \cdot |\Pi| = \frac{|F| \cdot (|F| - 1)}{2} \cdot |N|^2 \qquad (2)$$
where $Y = [y^r_{pq,ab}]_{|E| \cdot |\Pi|}$ is the binary matrix of all possible link mappings. For each successfully deployed SFC, the following constraints must be met:
Uniqueness constraint: a virtual network function of a task can only be embedded on one physical node, and a virtual link of a task can only be mapped to a unique physical link.
Bandwidth constraint: each physical link on the routing path must have enough bandwidth to meet the requirements of the virtual link.
Computing power constraint: the physical node needs to have enough computing resources to handle the data traffic carried by the SFC.
Compatibility constraints: each virtual network function can only be mapped to a physical node capable of processing this function, and a virtual link can only be mapped to a physical link with sufficient bandwidth resources, that is, the bandwidth resources of the physical link must meet the bandwidth resources required by the service request:
$$y^r_{pd,ac} = \begin{cases} 1, & B_{pd} \le B_{ac} \\ 0, & \text{otherwise} \end{cases} \qquad \forall E_{pd} \in E, \ \forall A_{ac} \in A$$
The function type determines the type of a VNF, and a VNF can only be mapped when its type and the type carried by the physical node are the same.
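The four constraints above can be illustrated with a small feasibility check for one candidate SFC mapping. This is only a sketch under stated assumptions, not the paper's implementation: the function names, the dictionary-based data layout, the single bandwidth demand per chain and the restriction that each VNF type appears at most once per chain are all hypothetical choices made for the example.

```python
from collections import defaultdict


def path_connects(path, src, dst):
    """True if the hop list leads from src to dst (an empty path means same node)."""
    if not path:
        return src == dst
    if path[0][0] != src or path[-1][1] != dst:
        return False
    return all(h1[1] == h2[0] for h1, h2 in zip(path, path[1:]))


def check_mapping(sfc, node_map, link_map, node_cpu, node_types,
                  link_bw, cpu_need, bw_need):
    """Check one SFC mapping against the four constraints (illustrative layout).

    sfc        : ordered list of VNF types (assumed unique within the chain)
    node_map   : VNF type -> physical node id
    link_map   : (p, q) -> list of physical hops [(a, b), ...]
    node_cpu   : node id -> free computing resources
    node_types : node id -> set of VNF types the node can host
    link_bw    : frozenset({a, b}) -> free bandwidth of the undirected link
    """
    # Uniqueness: every VNF has exactly one host (a dict allows at most one).
    if set(node_map) != set(sfc):
        return False
    cpu_used = defaultdict(float)
    for f in sfc:
        a = node_map[f]
        # Compatibility: the host must support this VNF type.
        if f not in node_types[a]:
            return False
        cpu_used[a] += cpu_need[f]
    # Computing power: summed demand per node must fit its free resources.
    if any(cpu_used[a] > node_cpu[a] for a in cpu_used):
        return False
    bw_used = defaultdict(float)
    for p, q in zip(sfc, sfc[1:]):
        path = link_map.get((p, q))
        if path is None or not path_connects(path, node_map[p], node_map[q]):
            return False
        for a, b in path:
            bw_used[frozenset((a, b))] += bw_need
    # Bandwidth: every physical link on every path must have enough capacity.
    return all(bw_used[e] <= link_bw.get(e, 0.0) for e in bw_used)


node_cpu = {"u1": 10.0, "g1": 10.0, "g2": 4.0}
node_types = {"u1": {"fw"}, "g1": {"nat", "fw"}, "g2": {"ids"}}
link_bw = {frozenset(("u1", "g1")): 500.0, frozenset(("g1", "g2")): 500.0}
sfc = ["fw", "nat", "ids"]
node_map = {"fw": "u1", "nat": "g1", "ids": "g2"}
link_map = {("fw", "nat"): [("u1", "g1")], ("nat", "ids"): [("g1", "g2")]}
cpu_need = {"fw": 2.0, "nat": 3.0, "ids": 4.0}

ok = check_mapping(sfc, node_map, link_map, node_cpu, node_types,
                   link_bw, cpu_need, bw_need=20.0)
too_wide = check_mapping(sfc, node_map, link_map, node_cpu, node_types,
                         link_bw, cpu_need, bw_need=600.0)
```

In this toy instance the first mapping is feasible, while the second fails the bandwidth constraint because the demand exceeds the 500 Mb/s link capacity.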
We used the load-balancing index $\alpha$ to measure the effect of network load balancing, where $\omega_n^k$ represents the $k$-type resource occupancy rate of node $n$ and $\omega$ represents the average $k$-type resource occupancy rate over all physical nodes.
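The expression for the load-balancing index is not reproduced in this text. As an illustration only, the sketch below assumes that the index is the standard deviation of the per-node occupancy rates around their mean for one resource type, so that a lower value means a more even load. The function name and the choice of the standard deviation are assumptions and may differ from the paper's definition.

```python
import math


def load_balance_index(occupancy):
    """Assumed index: population standard deviation of node occupancy rates.

    occupancy: list of occupancy rates (0..1) of one resource type, one per node.
    """
    mean = sum(occupancy) / len(occupancy)
    return math.sqrt(sum((w - mean) ** 2 for w in occupancy) / len(occupancy))


balanced = load_balance_index([0.5, 0.5, 0.5, 0.5])   # evenly loaded nodes
skewed = load_balance_index([0.9, 0.1, 0.9, 0.1])     # load concentrated on two nodes
```

Under this assumption, the evenly loaded network scores lower than the skewed one, which matches the way the index is interpreted in the experiments (lower is better).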
Since the computational consumption of processing a VNF is fixed, the resource consumption requested by a task in this article includes instantiation consumption and bandwidth consumption. However, air-ground integrated network nodes differ in terms of coverage and computing power. Therefore, their instantiation overhead and bandwidth consumption are different. In the resource consumption expression (11), $\gamma_{f,a}$ represents the instantiation overhead of mapping function $f$ to physical node $a$ in the air network, and $\varepsilon_{f,a}$ represents the instantiation overhead of mapping function $f$ to physical node $a$ in the ground network. $D_{pq,ab}$ represents the actual bandwidth requirement for mapping virtual link $pq$ to physical link $ab$ in the air network, $B_{pq,ab}$ represents the actual bandwidth requirement for mapping virtual link $pq$ to physical link $ab$ in the ground network and $Q_{pq,ab}$ represents the actual bandwidth requirement for mapping virtual link $pq$ to physical link $ab$ on the air-ground links.

Selection Method of VNFs to Be Aggregated Based on Task Similarity
If each VNF is instantiated separately for each SFC, deploying the service function chains incurs a huge instantiation overhead and wastes resources. VNF aggregation effectively solves this problem. However, the existing function aggregation methods only focus on reducing the instantiation overhead and ignore the load imbalance caused by VNF aggregation. To restore load balance, Refs. [9,10,12,13] adopted VNF migration and SFC reconstruction, which seriously affected the stability of the links.
As shown in Figure 1a, when airship 3 communicates with ship 1, the link indicated by the dashed line is selected to complete the communication. When airship 1 also receives the task of communicating with ship 1, it chooses the link represented by the solid line because the tasks are handled independently. This not only wastes network resources but also causes the links between nodes to be repeatedly disconnected, which makes the network unstable. If, based on the similarity of the tasks, airship 1 instead follows the routing path shown by the solid line in Figure 1b, this problem is solved.
In response to the above problems, this study proposed a method for selecting the VNFs to be aggregated based on task similarity.
The core idea of the method is as follows: (1) Among the tasks that arrive in time interval $T$, tasks whose required virtual network functions and resource demands are more similar are placed into the same category. Tasks within the same class share the same type of physical node. (2) Within a category, identical VNFs are selected for functional aggregation, provided that the bandwidth and computing constraints are satisfied. (3) Adjusting the similarity threshold trades off resource consumption against load balancing.
The specific implementation is as follows: Task similarity is divided into functional similarity and resource similarity. Task similarity within the time interval T is considered in order to divide similar tasks into the same class to share physical network nodes with the same functionality. On the one hand, this can reduce the instantiation overhead of the same virtual network function for similar tasks, reduce the global waiting time and maintain network stability. On the other hand, it can reduce the uneven load caused by all tasks sharing the same physical node for the same virtual network function. However, the task similarity division according to the time interval T cannot completely solve the problem of an uneven load, and thus, the task similarity threshold can be adjusted to balance the resource consumption and load balancing to meet the task requirements in different scenarios.
The definition of task similarity in the time interval $T$ is given here, which includes virtual network function similarity and resource similarity.
Virtual network function similarity refers to the similarity of the VNFs required by different tasks. $Tsim(r_i, r_j)$ is the similarity of the virtual network functions of the two tasks $r_i$ and $r_j$. $typequal(r_i, r_j)$ is the number of virtual network functions of the same type contained in tasks $r_i$ and $r_j$. $totaltype(r_i)$ is the total number of virtual network functions in service request $r_i$, and $precent_i$ is the proportion of the shared virtual network functions in task $r_i$ to its total number of virtual network functions.
When $t$ tasks arrive, the system first calculates the similarity of each pair of tasks and then divides the sum by the number of task pairs to obtain the virtual network function similarity of the $t$ tasks.
Resource similarity refers to the similarity of the actual demand for the resources of the same virtual network function. $r_{f_i}$ represents the resources that task $r$ needs to access for virtual network function $f_i$. $Ssim(f_i, f_j)$ represents the similarity of the resources required for the same virtual network function in tasks $r_i$ and $r_j$. $\omega_{il}$ and $\omega_{jl}$ are the actual demands for the $l$th resource of the same virtual network function $f$ for tasks $r_i$ and $r_j$, respectively.
The resource similarity of the $t$ arriving tasks and their virtual network function similarity are then combined into the task similarity, where $\gamma$ and $\xi$ are weights.
After classification according to task similarity, the system judges whether the aggregation can be performed by checking whether the bandwidth and computing resources meet the physical network constraints. The aggregation is performed when the constraints are met; otherwise, the similarity threshold is adjusted. The network resource status is updated after each division by task similarity. For load balancing, this study determined the task classification by presetting similarity thresholds. While ensuring that the resource consumption is reduced, load balancing is achieved by adjusting the similarity threshold so that too many tasks do not wait for the same physical node.
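The similarity expressions themselves are not reproduced in this text, so the following sketch fills them in with explicit assumptions: the functional similarity is taken as the mean of the two shared-VNF proportions, the resource similarity as one minus the relative difference of the demands for shared VNFs, and the task similarity as a weighted sum with weights `gamma` and `xi`. The greedy threshold-based grouping in `classify` is likewise a hypothetical way to form classes, not necessarily the procedure used in the paper.

```python
from itertools import combinations

# A task is modelled as: VNF type -> tuple of resource demands (assumed layout).


def tsim(ri, rj):
    """Functional similarity: mean of the shared-VNF proportions (assumed form)."""
    common = len(set(ri) & set(rj))
    return 0.5 * (common / len(ri) + common / len(rj))


def ssim(ri, rj):
    """Resource similarity over shared VNFs: 1 - relative demand gap (assumed form)."""
    shared = set(ri) & set(rj)
    if not shared:
        return 0.0
    scores = []
    for f in shared:
        for wi, wj in zip(ri[f], rj[f]):
            hi = max(wi, wj)
            scores.append(1.0 if hi == 0 else 1.0 - abs(wi - wj) / hi)
    return sum(scores) / len(scores)


def task_sim(ri, rj, gamma=0.5, xi=0.5):
    """Weighted combination of the two similarities (weighted sum is assumed)."""
    return gamma * tsim(ri, rj) + xi * ssim(ri, rj)


def mean_pairwise(tasks, sim):
    """Sum of pairwise similarities divided by the number of task pairs."""
    pairs = list(combinations(tasks, 2))
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def classify(tasks, threshold):
    """Greedy grouping: join the first class whose members are all similar enough."""
    classes = []
    for name, task in tasks.items():
        for cls in classes:
            if all(task_sim(task, tasks[m]) >= threshold for m in cls):
                cls.append(name)
                break
        else:
            classes.append([name])
    return classes


tasks = {
    "r1": {"fw": (2.0, 10.0), "nat": (1.0, 5.0), "ids": (4.0, 20.0)},
    "r2": {"fw": (2.0, 10.0), "nat": (1.0, 5.0), "dpi": (3.0, 15.0)},
    "r3": {"lb": (1.0, 5.0), "dpi": (3.0, 15.0), "vpn": (2.0, 8.0)},
}
loose = classify(tasks, threshold=0.7)    # r1 and r2 share a class
strict = classify(tasks, threshold=0.9)   # every task ends up alone
overall = mean_pairwise(list(tasks.values()), tsim)
```

Raising the threshold from 0.7 to 0.9 splits the class containing r1 and r2, so fewer VNFs can be aggregated. This is the trade-off between resource consumption and load balancing that the similarity threshold controls.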

SFC Mapping Node Selection Algorithm Based on Resource Awareness
To complete the user's service request, the tasks that arrive within time $T$ are first classified by task similarity and then deployed on appropriate physical nodes through the SFC mapping algorithm. The proposed algorithm is based on the awareness of network resources, which include the remaining link bandwidth and the usage of node computing resources. We adopted an improved genetic algorithm to select the physical network nodes to map onto. Different SFC construction schemes lead to different mapping results, as does selecting different physical nodes for mapping. Compared with searching for each VNF separately or mapping using graph theory, a genetic algorithm explores from a set of strings, which reduces the complexity of the algorithm. A genetic algorithm is also easy to parallelize: it can handle multiple individuals in the population and deal with different SFC construction schemes simultaneously. However, a genetic algorithm on its own is inefficient and can easily converge prematurely. Particle swarm optimization has the advantages of fast computation and strong global search ability; therefore, this study proposed a genetic-particle swarm optimization (GA-PSO) algorithm.
In GA-PSO, the particles of the swarm serve as the parent chromosomes of the genetic algorithm, and the velocity and position updates of the particle swarm are replaced by the crossover and mutation operations of the genetic algorithm. For single-chain and multi-chain deployments, different coding, crossover and mutation methods are used to implement the deployment of service function chains.
(1) Chromosome coding
Single-chain individual mapping coding: When performing single-chain mapping, SFCs in the same category are deployed one after the other in the order of arrival. That is, within the same category, for a VNF that has already been instantiated, the same VNF of a subsequent SFC is deployed on the same physical node, as shown in Figure 2. For a service function chain that satisfies the dependency, the shortest path method is used to calculate the path according to the service function chain mapping code.
Multi-chain and multi-encoding: In the case of multi-chain mapping, SFCs in the same class are deployed simultaneously, multiple service function chains are coded as a chromosome and multiple SFC mapping schemes in the class are multi-encoded. The same VNF in the same category is mapped on the same physical node, as shown in Figure 3. The shortest path method is used to calculate the shortest path according to the deployment location of the aggregation function.
(2) Fitness function
Single-chain: When performing single-chain mapping, SFCs in the same category are deployed one after the other in the reference order. After each service function chain is deployed, the subsequent service function chains share the same VNF. Therefore, when deploying a single chain, the overhead saved by aggregation needs to be subtracted from the resource consumption. This saving includes the reduced instantiation overhead of the air nodes and ground nodes. $\psi_f$ and $\phi_f$ represent the numbers of instantiations saved on the air and ground nodes, respectively, after function $f$ is shared.
Multi-chain: In multi-chain deployment, SFCs in the same category are deployed simultaneously. First, we select the optimal deployment location of the aggregated VNF and then find the deployment locations of the other VNFs. Therefore, when multiple chains are deployed simultaneously, the changes in bandwidth resources after aggregation should be considered. At the same time, the overhead saved by aggregation, which includes the reduced instantiation overhead of the air nodes and ground nodes, needs to be subtracted from the resource consumption.
(3) Chromosome crossover and mutation
The chromosome crossover was divided into two parts, namely, optimal crossover and free crossover. The optimal crossover is the crossover between an optimal chromosome and a common chromosome; it includes the crossover between the individual optimal chromosome and the common chromosome as well as the crossover between the group optimal chromosome and the common chromosome. Free crossover prevents the algorithm from converging too quickly by crossing the chromosomes obtained from the optimal crossover with common chromosomes. When a crossover is performed, the crossover position is randomly selected for replacement.
When a chromosome is mutated, the mutation position is randomly selected. Then, a node is randomly selected from the set of physical nodes that can carry the same VNF and is placed at the mutated position.
Both the crossover and mutation operations must pass conflict detection, that is, they must satisfy the uniqueness constraint and the compatibility constraint.
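The crossover and mutation steps described above can be sketched as a compact GA-PSO loop. This is a minimal illustration under assumptions, not the authors' implementation: a chromosome is taken to be a list with one physical node per VNF position (which enforces uniqueness by construction), `candidates[i]` is a hypothetical list of nodes able to host the VNF at position `i` (compatibility), each crossover copies a single gene, and the `cost` function and `mutation_rate` value are placeholders rather than the paper's fitness function and settings.

```python
import random


def crossover(child, donor, candidates, rng):
    """Copy one randomly chosen gene from donor into a copy of child."""
    out = list(child)
    i = rng.randrange(len(out))
    if donor[i] in candidates[i]:       # conflict detection: compatibility
        out[i] = donor[i]
    return out


def mutate(chrom, candidates, rng):
    """Replace one random gene with a random node able to host that VNF."""
    out = list(chrom)
    i = rng.randrange(len(out))
    out[i] = rng.choice(candidates[i])
    return out


def ga_pso(candidates, cost, pop_size=20, iters=100, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(c) for c in candidates] for _ in range(pop_size)]
    pbest = [list(p) for p in pop]              # individual optimal chromosomes
    gbest = list(min(pop, key=cost))            # group optimal chromosome
    for _ in range(iters):
        for k in range(pop_size):
            x = crossover(pop[k], pbest[k], candidates, rng)    # optimal (individual)
            x = crossover(x, gbest, candidates, rng)            # optimal (group)
            x = crossover(x, rng.choice(pop), candidates, rng)  # free crossover
            if rng.random() < mutation_rate:
                x = mutate(x, candidates, rng)
            pop[k] = x
            if cost(x) < cost(pbest[k]):
                pbest[k] = x
        gbest = list(min(pbest + [gbest], key=cost))
    return gbest


# Toy instance: three VNF positions, each with two candidate hosts.
candidates = [["u1", "g1"], ["g1", "g2"], ["g2", "g3"]]
node_cost = {"u1": 5.0, "g1": 1.0, "g2": 2.0, "g3": 1.5}


def cost(chrom):
    # Placeholder fitness: node overhead plus one unit per inter-node hop.
    hops = sum(1.0 for a, b in zip(chrom, chrom[1:]) if a != b)
    return sum(node_cost[n] for n in chrom) + hops


best = ga_pso(candidates, cost)
```

Keeping the individual and group optimal chromosomes as crossover donors plays the role of the particle swarm's attraction toward personal and global bests, while the free crossover and mutation preserve diversity.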

Experimental Setup
To verify the performance of the algorithm, we simulated it in MATLAB 2018a on a computer with an Intel Core i5-4210U CPU and 8 GB of memory. The simulation used a network topology from SNDlib [14] with 25 nodes and 45 edges. The node with the most robust connectivity was selected as the UAV space network node, and the other nodes were used as the ground network topology. In the experiment, the edges of the topology were undirected, and the bandwidth was 500 Mb/s. Five VNF types were used in the experiment, and each node randomly selected two to three types as the VNF types that it could host. Each service request contained at least three and at most five VNFs. The minimum similarity of the tasks was 0.3, and the source node and destination node of each task were randomly selected.
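The random parts of this setup can be reproduced with a few lines of code. The sketch below is written in Python rather than MATLAB and is only an assumed reconstruction: the VNF type names, the function names, the distinct source and destination nodes and the distinct VNF types within one request are illustrative choices that the text does not specify.

```python
import random

VNF_TYPES = ["f1", "f2", "f3", "f4", "f5"]   # five VNF types (names are placeholders)
N_NODES = 25                                  # node count of the topology
LINK_BANDWIDTH_MBPS = 500                     # bandwidth of every undirected edge


def make_nodes(n_nodes=N_NODES, seed=0):
    """Assign two to three hostable VNF types to every physical node."""
    rng = random.Random(seed)
    return {n: set(rng.sample(VNF_TYPES, rng.randint(2, 3))) for n in range(n_nodes)}


def make_request(rng, n_nodes=N_NODES):
    """Draw one service request with three to five VNFs and random endpoints."""
    src, dst = rng.sample(range(n_nodes), 2)          # assumed distinct endpoints
    chain = rng.sample(VNF_TYPES, rng.randint(3, 5))  # assumed distinct VNF types
    return {"src": src, "dst": dst, "chain": chain}


nodes = make_nodes()
rng = random.Random(42)
requests = [make_request(rng) for _ in range(10)]
```

Fixing the seeds makes repeated runs comparable, which is useful when results are averaged over many experiments.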

Experimental Analysis
First, to verify the effectiveness and convergence of the GA-PSO algorithm proposed in this study, we randomly selected the task requests and set the task request bandwidth to 20 Mb/s, the population size to 20 and the number of iterations to 100, and we averaged the results over 200 experiments. The results are shown in Figure 4.
It can be seen from Figure 4 that the GA-PSO algorithm proposed in this study was superior to the GA algorithm in terms of the convergence speed and fitness value.

Figures 5 and 6 compare the load-balancing index and resource consumption of the proposed algorithm with those of the GA algorithm, the greedy algorithm and the RANDOM algorithm for different numbers of tasks. As can be seen from Figures 5 and 6, as the number of tasks gradually increased, the algorithm proposed in this study consistently performed better than the other algorithms. This is because the RANDOM algorithm does not have any optimization strategy and only deploys SFCs randomly. The greedy algorithm only focuses on finding the optimal solution to the current subproblem, that is, the deployment location of the next virtual network function, and not on the consumption and load of the overall SFC deployment. The traditional GA algorithm tends to converge too fast and can easily fall into a local optimum, while the algorithm proposed in this study adds a particle swarm algorithm to the traditional genetic algorithm and improves the crossover and mutation methods to give it better convergence and optimization; therefore, the proposed algorithm has an advantage in both load balancing and resource consumption.
Figure 7 compares the load-balancing index of the single-chain deployment method proposed in this study under different similarity thresholds with that of the GA algorithm with a similarity threshold of 0.3 and the randomly deployed RANDOM algorithm for different numbers of tasks. Since the minimum similarity between tasks in this study was 0.3, when the similarity threshold was 0.3, all the same VNFs were aggregated, that is, the traditional aggregation method was used. Compared with the genetic algorithm that uses functional aggregation for all the same VNFs, the load-balancing index of the GA-PSO algorithm for the same number of tasks was slightly lower than that of the GA algorithm because the GA-PSO algorithm had better optimization performance and convergence.
The load-balancing index of the RANDOM algorithm was lower than that of the GA-PSO deployment with a similarity threshold of 0.3 because the RANDOM algorithm deployed the VNFs of each service function chain randomly. Unlike aggregating all the same VNFs, random deployment does not place all the same VNFs on the same physical node for queuing. When the similarity threshold was 0.7, the load-balancing index reached its lowest value. When the similarity threshold was smaller, the aggregation degree of the VNFs was higher, which caused an excessive load on the physical nodes. When the similarity threshold was higher, the aggregation degree of the VNFs was lower, and the deployment of the SFCs relied mainly on resource consumption, so the load-balancing index was not necessarily at its lowest value.
Figure 8 compares the resource consumption of the single-chain deployment method proposed in this study under different similarity thresholds with that of the GA algorithm with a similarity threshold of 0.3 and the randomly deployed RANDOM algorithm for different numbers of tasks. When the similarity threshold was 0.3, the resource consumption of the GA algorithm was higher than that of GA-PSO for the same number of tasks. When the same aggregation method was used, the GA-PSO algorithm had better optimization performance and convergence.
It can be seen from the figure that as the similarity threshold gradually increased, resource consumption also increased. As the task similarity threshold increased, the number of aggregated VNFs continued to decrease. As a result, more and more VNFs needed to be instantiated, and the increase in instantiation overhead led to an increase in resource consumption. The RANDOM algorithm had the highest resource consumption because it deployed SFCs without any optimization strategy.
Figure 9 compares the single-chain and multi-chain load-balancing indexes under different similarity thresholds. For a single chain, as the similarity threshold increased, the load-balancing index had its lowest value at 0.7. For a multi-chain, the load-balancing index always showed a downward trend. Because the single-chain and multi-chain deployment methods differ, their trends differed as the similarity threshold increased. The single-chain deployment method deployed SFCs in the same category in a reference order. After each service function chain was deployed, the subsequent SFCs shared the same VNF. As the similarity threshold increased, the deployment of each SFC depended mostly on resource consumption, so the load-balancing index was not necessarily at its lowest value, and thus, the lowest value appeared at 0.7. The multi-chain deployment method deployed SFCs in the same category simultaneously. First, the system selected the optimal deployment location of the aggregated VNF and then found the deployment locations of the other VNFs. As the similarity threshold increased, multiple SFCs found the optimal path together, which reduced the dependence of a single SFC on resource consumption, and thus, the load-balancing index was on a downward trend.
At a similarity threshold of 0.7, the load-balancing index of multi-chain deployment was slightly higher than that of single-chain deployment. At all other similarity thresholds, the load-balancing index of the multi-chain system was lower than that of the single-chain system. Therefore, in terms of reducing the load-balancing index, multi-chain deployment was better than single-chain deployment in most cases.
Figure 10 compares the single-chain and multi-chain resource consumption under different similarity thresholds. At every similarity threshold, the resource consumption of multi-chain deployment was lower than that of single-chain deployment. Because multi-chain deployment jointly considers the optimal deployment location of the VNFs to be aggregated, its deployment overhead is lower than that of single-chain deployment. For both single-chain and multi-chain deployment, resource consumption rose gradually as the similarity threshold increased: fewer VNFs were aggregated, so more VNFs had to be instantiated, which increased the instantiation overhead and hence the resource consumption. Thus, at the same similarity threshold, multi-chain deployment was better than single-chain deployment in terms of resource consumption.
The similarity threshold is the primary basis for task classification. When the similarity threshold is lower, the aggregation degree of the VNFs is higher; resource consumption is reduced, but the load becomes excessively unbalanced, which affects service quality. When the similarity threshold is higher, the aggregation degree of the VNFs is lower, which makes resource consumption higher. Therefore, it is necessary to select an appropriate similarity threshold for classifying tasks according to the needs of the scenario.
Figure 10. Comparison of the single-chain and multi-chain resource consumption with different similarity thresholds.
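The relationship between the similarity threshold and the number of instantiated VNFs can be illustrated with a small sketch. It assumes, for illustration only, that each task is a set of VNF types, that task similarity is the Jaccard index, that a task joins the first class whose reference task it matches at or above the threshold, and that each VNF type is instantiated once per class. The similarity measure, the greedy classification rule, and the sample tasks are hypothetical stand-ins, not the model used in this study.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of VNF types (assumed similarity measure)."""
    return len(a & b) / len(a | b)


def classify(tasks, threshold):
    """Greedily group tasks: a task joins the first class whose reference
    (first) task is at least `threshold` similar; otherwise it opens a new class."""
    classes = []
    for task in tasks:
        for cls in classes:
            if jaccard(task, cls[0]) >= threshold:
                cls.append(task)
                break
        else:
            classes.append([task])
    return classes


def instantiated_vnfs(tasks, threshold):
    """Count VNF instances, assuming each VNF type is shared within a class."""
    return sum(len(set().union(*cls)) for cls in classify(tasks, threshold))


# Hypothetical tasks, each requesting three VNF types.
tasks = [
    {"firewall", "nat", "ids"},
    {"firewall", "nat", "dpi"},
    {"firewall", "ids", "cache"},
    {"nat", "dpi", "cache"},
]

for threshold in (0.2, 0.5, 0.7):
    print(threshold, len(classify(tasks, threshold)), instantiated_vnfs(tasks, threshold))
```

With these sample tasks, thresholds of 0.2, 0.5 and 0.7 produce 1, 2 and 4 classes and 5, 8 and 12 VNF instances, respectively. A higher threshold therefore means less sharing and more instantiation, in line with the trend discussed for Figure 10.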

Conclusions
This study investigated the load imbalance caused by function aggregation when deploying SFCs in the air-ground network under network function virtualization technology. A virtual network function (VNF) aggregation scheme based on task similarity was proposed, which considers air-ground network resource consumption and load imbalance before SFC mapping. A model for selecting the VNFs to be aggregated based on task similarity was established: the tasks were classified according to their similarity, the functions to be aggregated were selected within each class under the constraints of the underlying physical resources, and load balance was achieved by adjusting the similarity threshold. We adopted an SFC mapping selection scheme based on network resource awareness, which obtains the most suitable physical nodes for single-chain and multi-chain mapping according to the various attributes of the physical network nodes, so as to make full use of the underlying physical resources and reduce resource consumption and load imbalance. The results showed that the proposed algorithm solved the load imbalance problem caused by function aggregation and that the similarity threshold can be adjusted to meet the load-balancing demands of different scenarios. However, the method of task classification still needs to be improved. In future work, task classification will be extended with machine learning methods to better achieve function aggregation.